Modeling Heart Disease Mortality in Space and Time

In Waco and Surrounding Counties

Dusty Turner

Motivating Question

What factors impact heart disease mortality in the United States?

According to the CDC:

Heart disease was the leading cause of death from 2009 to 2019.¹

  1. Heart disease fatalities dropped by 21.3 per 100,000 people (from 182.8 to 161.5) over the last decade.
  2. Males were nearly twice as likely as females to die of heart disease.
  3. Heart disease death rates are highest among Black, non-Hispanics and lowest among Asian and Pacific Islanders.

For perspective, during the height of COVID, it accounted for 1 in 8 deaths (or 697,000 deaths) in the United States. Heart disease was the number one cause of death, followed by cancer, with these two causes of death accounting for a total of 2.15 million deaths.²

Other Factors

Prior research has identified additional factors that affect heart disease fatalities:

  1. Personal diet: Ulbricht and Southgate (1991)
  2. Stress: Bunker et al. (2003)
  3. Sleep quality and duration: Lao et al. (2018)
  4. Hereditary factors: Bates et al. (2003)
  5. Culture: Syme, Hyman, and Enterline (1964)

We will look at

Space

Time

Age

Gender

Ozone

Housing

Violent Crime

Explore the data

Heart Disease Deaths: Time

Heart Disease Deaths: Gender and Age

Heart Disease Deaths: Ozone Coverage

Mean estimated 8-hour average ozone concentration in parts per billion (ppb) within 3 meters of the surface of the earth

Heart Disease Deaths: Housing

Heart Disease Deaths: Violent Crimes

Pre Modeling Decisions

  1. Training data: 2001-2011
  2. Testing data: 2011-2016
  3. Locations Considered: McLennan and surrounding counties

Modeling Steps

  1. Fit Multivariate Spatial Models
    – Linear Regression
    – Spatial Model, no nugget effect {bmstdr}
    – Spatial Model, nugget effect {spBayes} and {INLA}

  2. Analyze Models
    – Error Metrics
    – Covariate Significance
    – Acknowledge Computational Time

  3. Fit Spatio-Temporal Models
    – Spatio-temporal model fitting with {spTimer}
    – Autoregressive model fitting with {INLA}

Independent Error Regression Model

\(y \mid x \sim N(\beta_0 + \sum_i \beta_i x_i, \sigma^2)\)

\(\beta_i \sim N(0,1000)\)

\(\sigma^2 \sim Gamma(\alpha = 2,\beta = 1)\)

Params             mean      2.5%     97.5%        sd  Sig
(Intercept)       −74.7 −25,912.3  25,763.0  13,173.9
35_64            −748.3 −23,123.1  21,626.5  11,408.3
65_and_older      673.6 −21,701.2  23,048.5  11,408.3
men               108.8 −22,266.0  22,483.6  11,408.3
women            −183.5 −22,558.3  22,191.3  11,408.3
ozone               1.1     −12.3      14.4       6.8
hcp                 0.6      −4.9       6.0       2.8
hpi                −0.7      −1.0      −0.4       0.2  *
crime_rate          2.1       1.6       2.6       0.2  *
sigma2         34,704.3  29,651.5  40,603.5   2,796.6
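The Sig column in these tables appears to mark covariates whose 2.5% to 97.5% interval excludes zero. That reading is an assumption inferred from the reported rows, not a rule stated on the slides. A minimal sketch of the check, using four rows copied from the table above (the function name `excludes_zero` is hypothetical):

```python
def excludes_zero(lower: float, upper: float) -> bool:
    """Return True when the interval (lower, upper) lies strictly on one side of zero."""
    return lower > 0 or upper < 0


# (2.5%, 97.5%) bounds copied from the independent error regression table.
intervals = {
    "ozone": (-12.3, 14.4),
    "hcp": (-4.9, 6.0),
    "hpi": (-1.0, -0.4),
    "crime_rate": (1.6, 2.6),
}

# Assumed rule: flag a covariate when its interval does not contain zero.
flagged = [name for name, (lo, hi) in intervals.items() if excludes_zero(lo, hi)]
print(flagged)
```

Under this rule an interval that touches zero, such as (−1.4, 0.0), is not flagged.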

Spatial Model; no Nugget Effect

{bmstdr}

\(Y \sim N_n(X\beta, \sigma^2_{\epsilon} H)\)

\(H_{ij} = \exp(-\phi d_{ij})\)

\(d_{ij}\): distance from location \(s_i\) to location \(s_j\)

\(\phi=0.9\)
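To make the exponential correlation structure concrete, the sketch below builds \(H\) for three sites with \(\phi = 0.9\). The coordinates are hypothetical and in arbitrary distance units; they are not the county locations used in the analysis, and this is not the {bmstdr} implementation.

```python
import math


def exponential_correlation(coords, phi):
    """Build H with H[i][j] = exp(-phi * d_ij), where d_ij is Euclidean distance."""
    n = len(coords)
    return [
        [math.exp(-phi * math.dist(coords[i], coords[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical site coordinates (illustration only).
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
H = exponential_correlation(sites, phi=0.9)

for row in H:
    print([round(value, 3) for value in row])
```

The diagonal is 1 because each site is at distance zero from itself, and the correlation decays as sites move apart.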

Params             mean      2.5%     97.5%        sd  Sig
(Intercept)       −73.0 −26,027.4  25,881.5  13,233.4
35_64            −750.3 −23,226.3  21,725.6  11,459.8
65_and_older      677.6 −21,798.4  23,153.5  11,459.8
men               110.8 −22,365.2  22,586.7  11,459.8
women            −183.7 −22,659.6  22,292.3  11,459.8
ozone               1.3     −11.9      14.4       6.7
hcp                 0.1      −5.2       5.5       2.7
hpi                −0.7      −1.0      −0.4       0.2  *
crime_rate          2.1       1.6       2.6       0.2  *
sigma2         35,018.8  29,920.2  40,971.4   2,821.9

Spatial Model; Nugget Effect

{spBayes}

\(Y \sim N(X\beta, \sigma^2_{\epsilon}I+\tau^2_{\omega}S_{\omega})\)

\((S_{\omega})_{ij} = \exp(-\phi d_{ij})\)

\(\phi \sim Unif(a,b)\)

\(\sigma^2_{\epsilon} \sim Gamma(\alpha_\sigma,\beta_\sigma)\)

\(\tau^2_{\omega} \sim Gamma(\alpha_\tau,\beta_\tau)\)
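The nugget model adds an independent error variance to the diagonal of the spatially correlated part. A minimal sketch of the covariance \(\sigma^2_{\epsilon}I+\tau^2_{\omega}S_{\omega}\) follows; the coordinates and parameter values are hypothetical, chosen only for illustration, and this is not how {spBayes} or {INLA} build the matrix internally.

```python
import math


def nugget_covariance(coords, phi, sigma2_eps, tau2_omega):
    """Return sigma2_eps * I + tau2_omega * S, with S[i][j] = exp(-phi * d_ij)."""
    n = len(coords)
    cov = []
    for i in range(n):
        row = []
        for j in range(n):
            spatial = tau2_omega * math.exp(-phi * math.dist(coords[i], coords[j]))
            nugget = sigma2_eps if i == j else 0.0  # nugget enters on the diagonal only
            row.append(spatial + nugget)
        cov.append(row)
    return cov


# Hypothetical inputs (not estimates from the slides).
sites = [(0.0, 0.0), (1.0, 0.0)]
cov = nugget_covariance(sites, phi=0.9, sigma2_eps=1.0, tau2_omega=4.0)
print(cov)
```

Setting `sigma2_eps` to zero recovers the no-nugget covariance up to the scale factor.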

Params             mean      2.5%     97.5%        sd  Sig
(Intercept)        −9.3    −199.9     179.8      97.8
35_64            −700.8    −838.3    −561.2      71.2  *
65_and_older      689.0     548.4     826.4      70.6  *
men               135.8      −0.2     273.4      70.2
women            −149.9    −287.9     −15.8      70.1  *
ozone              −1.3      −8.2       5.8       3.6
hcp                 0.9      −4.3       6.0       2.6
hpi                −0.7      −1.0      −0.4       0.2  *
crime_rate          2.0       1.6       2.4       0.2  *
sigma.sq            1.1       0.2       4.8       2.4
tau.sq         35,617.3  30,295.5  41,940.1   3,079.3
phi                 0.9       0.0       1.9       0.6

Spatial Model; Nugget Effect

{INLA}

\(Y \sim N(X\beta, \sigma^2_{\epsilon}I+\tau^2_{\omega}S_{\omega})\)

\((S_{\omega})_{ij} = \exp(-\phi d_{ij})\)

\(\phi \sim Unif(a,b)\)

\(\sigma^2_{\epsilon} \sim Gamma(\alpha_\sigma,\beta_\sigma)\)

\(\tau^2_{\omega} \sim Gamma(\alpha_\tau,\beta_\tau)\)

Params             mean      2.5%     97.5%        sd  Sig
(Intercept)         0.4     −61.6      62.9      31.6
35_64            −307.9    −370.5    −244.7      32.0  *
65_and_older      306.4     244.4     368.9      31.6  *
men                63.4       8.2     118.3      27.9  *
women             −63.3    −119.0      −8.3      28.5  *
ozone              −1.7     −13.3       9.6       5.9
hcp                 1.0     −11.5      13.3       6.2
hpi                −0.7      −1.4       0.0       0.4
crime_rate          2.0       1.1       2.9       0.5  *
phi                 1.0       0.4       2.4       0.6
sigmasq             0.0       0.0       0.0       0.0
tausq         209,301.1 168,267.5 248,651.5  20,436.8

Compare Models: Cross Validation

Model Scoring

model      rmse   mae    cvg    Comp_Time  Sig
spBayes    206.1  178.5  97.9   300        35_64, 65_and_older, women, hpi, crime_rate
none       221.3  194.3  93.6   10         hpi, crime_rate
no nugget  238.6  204.5  89.3   17         hpi, crime_rate
inla       270.9  237.5  100.0  36         35_64, 65_and_older, men, women, crime_rate
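The rmse, mae, and cvg columns can be computed from held-out observations, point predictions, and interval bounds. The sketch below assumes cvg is the percentage of test observations that fall inside their prediction interval; the slides do not define it, so treat that as an assumption. All data values are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def mae(observed, predicted):
    """Mean absolute error between observations and predictions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def coverage(observed, lower, upper):
    """Percentage of observations inside [lower, upper] (assumed meaning of cvg)."""
    hits = sum(lo <= o <= hi for o, lo, hi in zip(observed, lower, upper))
    return 100.0 * hits / len(observed)


# Made-up held-out values, predictions, and interval bounds.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 400.0]
lower = [90.0, 150.0, 310.0, 350.0]
upper = [130.0, 230.0, 350.0, 450.0]

print(rmse(observed, predicted), mae(observed, predicted), coverage(observed, lower, upper))
```

A coverage far from the nominal interval level in either direction suggests the predictive uncertainty is miscalibrated, which is worth weighing alongside rmse and mae.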

Move forward with {spBayes}

Location
Time
Age
Gender
Housing Change Percent
Crime Rate

Spatio-Temporal

Inseparable Spatio-temporal Model

\[Y_{nt} \sim N(X_{nt}\beta, \tau_{\omega}^2 A_t S_{\omega} A_t' + \sigma_{\epsilon}^2 I)\] \(t = 1,\ldots,T\)
\(Y_{nt}\): Observations
\(X_{nt}\): Covariate values
\(n_t\): number of observation locations at time \(t\)

\[A_t = C_t S^{-1}_{\omega}\] where \(S_{\omega}\) is the \(m \times m\) correlation matrix of the Gaussian process at the \(m\) locations \(s^*_k\)
\(C_t\) is \(n_t \times m\), with the entry in row \(j\) and column \(k\) equal to \(\exp(-\phi|s_j-s^*_k|)\) for \(j=1,\ldots,n_t\) and \(k=1,\ldots,m\).

\(C_t\) captures the cross-correlation between the observation locations at time \(t\) and the \(m\) locations \(s^*_k\), \(k=1,\ldots,m\)

\(\beta \sim N(0,1000)\)
\(\sigma^2_{\epsilon} \sim Gamma(\alpha_\sigma,\beta_\sigma)\)
\(\tau^2_{\omega} \sim Gamma(\alpha_\tau,\beta_\tau)\)
\(\phi \sim Unif(0,T)\)
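The matrices \(C_t\), \(S_{\omega}\), and \(A_t\) can be assembled directly from their definitions. The sketch below uses hypothetical locations and a hypothetical \(\phi\), and inverts \(S_{\omega}\) with a small Gauss-Jordan routine only to stay self-contained; it is not the {spTimer} or {INLA} implementation.

```python
import math


def exp_cross_correlation(points_a, points_b, phi):
    """Matrix with entry (j, k) equal to exp(-phi * |a_j - b_k|)."""
    return [[math.exp(-phi * math.dist(a, b)) for b in points_b] for a in points_a]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical m = 3 process locations s*_k and n_t = 2 observation sites.
knots = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sites_t = [(0.0, 0.0), (0.5, 0.5)]
phi = 0.9  # hypothetical decay value

S = exp_cross_correlation(knots, knots, phi)      # m x m
C_t = exp_cross_correlation(sites_t, knots, phi)  # n_t x m
A_t = matmul(C_t, invert(S))                      # n_t x m
print(A_t)
```

Because the first observation site coincides with the first process location, the first row of `A_t` is approximately (1, 0, 0): that site reads the process value at its own location directly.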

Spatio-temporal Model

{spTimer}
\(\sigma^2_{\epsilon}\) random effect error
\(\sigma^2_{\eta}\) spatial-temporal error
\(\phi\) decay parameter

Params             mean      2.5%     97.5%        sd  Sig
(Intercept)       −30.8    −205.3     131.9      85.0
35_64            −709.1    −830.3    −573.5      65.5  *
65_and_older      678.7     557.8     807.0      65.4  *
men               130.6      −1.0     260.8      66.0
women            −158.9    −292.1     −22.6      66.9  *
hpi                −0.7      −1.0      −0.4       0.2  *
crime_rate          2.0       1.6       2.3       0.2  *
sig2eps             0.0       0.0       0.0       0.0
sig2eta        35,846.7  30,135.2  41,781.6   2,974.1
phi                 2.4       0.9       5.5       1.2

Inseparable Spatio-temporal Model

{INLA}
\(\sigma^2_{\epsilon}\) random effect error
\(\sigma^2_{\eta}\) spatial-temporal error
\(\phi\) decay parameter

Params             mean      2.5%     97.5%        sd  Sig
35_64            −306.1    −368.2    −247.8      30.4  *
65_and_older      303.4     248.2     362.5      28.8  *
men                61.0       6.3     120.4      28.4  *
women             −64.0    −119.6      −9.7      27.7  *
hpi                −0.8      −1.4       0.0       0.4
crime_rate          1.9       1.6       2.2       0.1  *
sigma2eps     218,153.2 167,225.3 277,972.3  27,650.7
sig2eta             2.1       0.6       4.7       1.1
phi                23.2      12.3      34.6       5.9

Compare Models: Cross Validation

Model Scoring

model    rmse   mae    cvg    Comp_Time  Sig
spTimer  167.7  120.7  100.0  1,980      35_64, 65_and_older, women, hpi, crime_rate
inla     234.9  202.8  22.9   12         35_64, 65_and_older, men, women, crime_rate

Bibliography

Bates, Benjamin R., Alan Templeton, Paul J. Achter, Tina M. Harris, and Celeste M. Condit. 2003. “What Does ‘a Gene for Heart Disease’ Mean? A Focus Group Study of Public Understandings of Genetic Risk Factors.” American Journal of Medical Genetics Part A 119A (2): 156–61. https://doi.org/10.1002/ajmg.a.20113.
Bunker, Stephen J, David M Colquhoun, Murray D Esler, Ian B Hickie, David Hunt, V Michael Jelinek, Brian F Oldenburg, et al. 2003. “‘Stress’ and Coronary Heart Disease: Psychosocial Risk Factors.” Medical Journal of Australia 178 (6): 272–76. https://doi.org/10.5694/j.1326-5377.2003.tb05193.x.
Lao, Xiang Qian, Xudong Liu, Han-Bing Deng, Ta-Chien Chan, Kin Fai Ho, Feng Wang, Roel Vermeulen, et al. 2018. “Sleep Quality, Sleep Duration, and the Risk of Coronary Heart Disease: A Prospective Cohort Study with 60,586 Adults.” Journal of Clinical Sleep Medicine 14 (01): 109–17. https://doi.org/10.5664/jcsm.6894.
Syme, S. Leonard, Merton M. Hyman, and Philip E. Enterline. 1964. “Some Social and Cultural Factors Associated with the Occurrence of Coronary Heart Disease.” Journal of Chronic Diseases 17 (3): 277–89. https://doi.org/10.1016/0021-9681(64)90155-9.
Ulbricht, T. L. V., and D. A. T. Southgate. 1991. “Coronary Heart Disease: Seven Dietary Factors.” The Lancet 338 (8773): 985–92. https://doi.org/10.1016/0140-6736(91)91846-M.